Abstract

We study the regularity of solutions to an optimal transporta- 
tion problem where the dimension of the source is larger than that 
of the target. We demonstrate that if the target is c-convex, then 
the source has a canonical foliation whose co-dimension is equal to 
the dimension of the target and the problem reduces to an optimal 
transportation problem between spaces with equal dimensions. If the 
c-convexity condition fails, we do not expect regularity for arbitrary 
smooth marginals, but, in the case where the source is 2-dimensional 
and the target is 1-dimensional, we identify sufficient conditions on
the marginals and cost to ensure that the optimal map is continuous. 

1 Introduction 

Let $X$ and $Y$ be smooth manifolds of dimensions $m$ and $n$, endowed with
Borel probability measures $\mu$ and $\nu$, respectively. We say that a Borel map
$F : X \to Y$ pushes $\mu$ forward to $\nu$ if for all Borel sets $A \subseteq Y$ we have
$\nu(A) = \mu(F^{-1}(A))$. For a given cost function $c : X \times Y \to \mathbb{R}$, Monge's
optimal transportation problem is then to find the Borel map $F$ pushing $\mu$
forward to $\nu$ that minimizes the total transportation cost:



$$\int_X c(x, F(x))\, d\mu. \tag{1}$$



This can be viewed as a stricter version of the Kantorovich optimal
transportation problem, which is to minimize:

$$\int_{X \times Y} c(x, y)\, d\gamma$$

among all Borel probability measures $\gamma$ on $X \times Y$ such that the
projections of $X \times Y$ onto $X$ and $Y$ push $\gamma$ forward to $\mu$ and $\nu$, respectively. In
fact, the usual method for finding solutions to Monge's problem is to first
find the Kantorovich solution; one can then show that, under certain
conditions, the solution $\gamma$ is concentrated on the graph of a function $F : X \to Y$
[20][15][2][16][6]. These conditions cannot generally hold if $m < n$; in this
case, however, there are known conditions under which $\gamma$ will concentrate on
the graph of a function $H : Y \to X$ and so it is preferable to reformulate
Monge's problem in terms of maps from $Y$ to $X$. When minimizing (1), then,
it is natural to restrict our attention to the case when $m \geq n$.

Monge's problem has numerous applications and has received a lot of 
attention from many different authors. Questions about the existence and 
uniqueness of optimal maps have been resolved for a wide class of cost func- 
tions; much of the present research in optimal transportation aims to under- 
stand the structure of these optimizers. A great deal of progress has been 
made in this direction, but it has mostly been restricted to the case when 
$m = n$; problems where $m > n$, on the other hand, have received very little
attention. Aside from being a natural mathematical generalization of
the relatively well understood m = n case, however, optimal transportation 
problems where m and n fail to coincide may have important applications; 
for example, in economics, optimal transportation type problems arise fre- 
quently and there is often no compelling reason to assume that m = n. For 
a treatment of a related problem in an economic context, see [7] and [9]; the
connections between these results and the present work will be explored by 
the present author in a separate paper. 

In the $m = n$ case, understanding the regularity, or smoothness, of the
optimal map has grown into an active and exciting area of research in the
past few years, due to a major breakthrough by Ma, Trudinger and Wang
[25]. They identified a fourth order differential condition on $c$ (called (A3S)
in the literature) which implies the smoothness of the optimizer, provided
the marginals $\mu$ and $\nu$ are smooth. Subsequent investigations by Trudinger
and Wang [28][29] revealed that these results actually hold under a slight
weakening of this condition, called (A3W), encompassing earlier results of
Caffarelli [3][4][5], Urbas [30] and Delanoe [8] when $c$ is the distance
squared on either $\mathbb{R}^n$ or certain Riemannian manifolds, and of Wang for another
special cost function [31]. Loeper [22] then verified that (A3W) is in fact
necessary for the solution to be continuous for arbitrary smooth marginals
$\mu$ and $\nu$. Loeper also proved that, under (A3S), the optimizer is Hölder
continuous even for rougher marginals; this result was subsequently improved
by Liu [21], who found a sharp Hölder exponent. Since then, many interesting
results about the regularity of optimal transportation have been established 

HBiiis]p]pii2]ii3]iii]in3in]. 

This article focuses on adapting these results to the $m > n$ setting. A
serious obstacle arises immediately: the regularity theory of Ma, Trudinger,
and Wang requires invertibility of the matrix of mixed second order partials
$\left(\frac{\partial^2 c}{\partial x^i \partial y^j}\right)_{ij}$, and its inverse appears explicitly in their formulations of (A3W)
and (A3S). When $m$ and $n$ fail to coincide, however, $\left(\frac{\partial^2 c}{\partial x^i \partial y^j}\right)_{ij}$ clearly cannot
be invertible. Alternate formulations of (A3W) and (A3S) that do not
explicitly use this invertibility are known; however, they rely instead on local
surjectivity of the map $y \mapsto D_x c(x, y)$, which cannot hold in our setting
either.

Nonetheless, there is a certain class of costs for which our problem can 
easily be solved using the results from the equal dimensional setting. Suppose 

c(x,y)=b(Q(x),y), (2) 

where $Q : X \to Z$ is smooth and $Z$ is a smooth manifold of dimension $n$.
In this case, it is not hard to show that the optimal map takes every point
in each level set of $Q$ to a common $y$ and studying its regularity amounts to
studying an optimal transportation problem on the $n$-dimensional spaces $Z$
and $Y$. We will show that costs of this form are essentially the only costs
on $X \times Y$ for which we can hope for regularity results for arbitrary smooth
marginals $\mu$ and $\nu$. Indeed, for the quadratic cost on Euclidean domains,
the regularity theory of Caffarelli requires convexity of the target $Y$ [3][4]
and, for general costs, it became apparent in the work of Ma, Trudinger and
Wang [25] that continuity of the optimizer cannot hold for arbitrary smooth
marginals unless $Y$ satisfies an appropriate, generalized notion of convexity.
Due to its dependence on the cost function, this condition is referred to as
c-convexity; when $m > n$, we will show that c-convexity necessarily fails unless
the cost function is of the form alluded to above.

In the next section, we will introduce preliminary concepts from the
regularity theory of optimal transportation, suitably adapted for general values of
$m \geq n$. In the third section, we prove that c-convexity implies the existence
of a quotient map Q as discussed above. We then show that the properties 
on Z which are necessary for the optimal map to be continuous follow from 
analogous properties on X. 

Given the preceding discussion, it is apparent that for cost functions that
are not of the special form (2), there are smooth marginals for which the
optimal map is discontinuous. However, as the condition (2) is so restrictive,
it is natural to ask about regularity for costs which are not of this form; any
result in this direction will require stronger conditions on the marginals than
smoothness. In the final section of our paper, we address this problem when
$m = 2$ and $n = 1$.

Acknowledgment: The author is pleased to thank Robert McCann and 
Paul Lee for fruitful discussions during the course of this work. 

2 Conditions and definitions 

Here we develop several definitions and conditions which we will require in
the following sections. We begin with some basic notation. In what follows,
we will assume that $X$ and $Y$ may be smoothly embedded in larger manifolds,
in which their closures, $\bar{X}$ and $\bar{Y}$, are compact. If $c$ is differentiable,
we will denote by $D_x c(x, y)$ its differential with respect to $x$. If $c$ is twice
differentiable, $D^2_{xy} c(x, y)$ will denote the map from the tangent space of $Y$ at
$y$, $T_y Y$, to the cotangent space of $X$ at $x$, $T_x^* X$, defined in local coordinates
by

$$\frac{\partial}{\partial y^i} \mapsto \frac{\partial^2 c(x, y)}{\partial y^i \partial x^j}\, dx^j,$$

where summation on $j$ is implicit, in accordance with the Einstein summation
convention. $D_y c(x, y)$ and $D^2_{yx} c(x, y)$ are defined analogously.

A function $u : X \to \mathbb{R}$ is called c-concave if $u(x) = \inf_{y \in Y} \big(c(x, y) - u^c(y)\big)$,
where $u^c(y) := \inf_{x \in X} \big(c(x, y) - u(x)\big)$.
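
The definition of c-concavity can be explored on a finite grid. The following is a minimal sketch, not part of the original argument: the grids, the quadratic cost and the two test functions are hypothetical choices made only for illustration. It replaces each infimum by a minimum over grid points and checks whether applying the transform twice returns the original function.

```python
# Illustrative sketch only: grids, cost and test functions are hypothetical.

def transform_to_Y(u, cost, xs, ys):
    """Discrete analogue of u^c(y) = inf_x c(x, y) - u(x)."""
    return [min(cost(x, y) - ux for x, ux in zip(xs, u)) for y in ys]


def transform_to_X(w, cost, xs, ys):
    """Discrete analogue of x -> inf_y c(x, y) - w(y)."""
    return [min(cost(x, y) - wy for y, wy in zip(ys, w)) for x in xs]


def is_c_concave(u, cost, xs, ys, tol=1e-9):
    """On the grid, u is c-concave iff transforming twice returns u."""
    u_c = transform_to_Y(u, cost, xs, ys)
    u_cc = transform_to_X(u_c, cost, xs, ys)
    return all(abs(a - b) <= tol for a, b in zip(u, u_cc))


def quadratic_cost(x, y):
    return (x - y) ** 2


xs = [i / 10 for i in range(11)]
ys = [j / 10 for j in range(11)]

# A transform of any function on Y is c-concave by construction.
u_good = transform_to_X([0.3 * y for y in ys], quadratic_cost, xs, ys)
# u(x) = 10 x^2 is too convex to be c-concave for this cost.
u_bad = [10 * x ** 2 for x in xs]

print(is_c_concave(u_good, quadratic_cost, xs, ys))
print(is_c_concave(u_bad, quadratic_cost, xs, ys))
```

The check relies on the general fact that the double transform always dominates $u$, with equality everywhere exactly when $u$ is c-concave.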

Next, we introduce the concept of c-convexity, which first appeared in
Ma, Trudinger and Wang.

Definition 2.1. We say the domain $Y$ looks c-convex from $x \in X$ if $D_x c(x, Y) =
\{D_x c(x, y) \mid y \in Y\}$ is a convex subset of $T_x^* X$. We say $Y$ is c-convex with
respect to $X$ if it looks c-convex from every $x \in X$.

Our next definition is novel, as it is completely irrelevant when $m = n$.
It will, however, play a vital role in the present setting.

Definition 2.2. We say the domain $Y$ looks c-linear from $x \in X$ if $D_x c(x, Y)$ is
contained in an $n$-dimensional linear subspace of $T_x^* X$. We say $Y$ is c-linear
with respect to $X$ if it looks c-linear from every $x \in X$.

When $m = n$, c-linearity is automatically satisfied. When $m > n$, this is
no longer true, although c-convexity clearly implies c-linearity.
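
To make c-linearity concrete in the case $m = 2$, $n = 1$, the sketch below samples the covectors $D_x c(x, y)$ for two costs that appear later in the paper, $c(x, y) = x_2 y$ and $c(x, y) = -x_1 \cos(y) - x_2 \sin(y)$, and tests whether they all lie on one line through the origin. The function names, sample angles and tolerance are assumptions introduced here for illustration; a finite sample can only suggest c-linearity, not prove it.

```python
import math


def grad_x_flat(x, y):
    """D_x c for the cost c(x, y) = x2 * y, namely (0, y)."""
    return (0.0, y)


def grad_x_circle(x, y):
    """D_x c for the cost c(x, y) = -x1 cos(y) - x2 sin(y)."""
    return (-math.cos(y), -math.sin(y))


def looks_c_linear(grad, x, ys, tol=1e-12):
    """Sampled test for m = 2, n = 1: the covectors D_x c(x, y) lie in a
    one-dimensional linear subspace iff they are pairwise parallel."""
    vecs = [grad(x, y) for y in ys]
    return all(abs(a[0] * b[1] - a[1] * b[0]) <= tol for a in vecs for b in vecs)


x = (0.5, 0.25)                       # hypothetical base point
ys = [0.1 * k for k in range(1, 10)]  # hypothetical samples of the target

print(looks_c_linear(grad_x_flat, x, ys))
print(looks_c_linear(grad_x_circle, x, ys))
```

For the first cost the sampled image is a segment of the $x_2$-axis, which is both linear and convex; for the second it is an arc of the unit circle, which is neither.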

We will also have reason to consider the level set of $\tilde{x} \mapsto D_y c(\tilde{x}, y)$ passing
through $x$, $L_x(y) := \{\tilde{x} \in X : D_y c(\tilde{x}, y) = D_y c(x, y)\}$.

Let us now state the first three regularity conditions introduced by Ma,
Trudinger and Wang.

(A0): The function $c \in C^4(\bar{X} \times \bar{Y})$.

(A1): (Twist) For all $x \in X$, the map $y \mapsto D_x c(x, y)$ is injective on $Y$.

(A2): (Non-degeneracy) For all $x \in X$ and $y \in Y$, the map $D^2_{xy} c(x, y) :
T_y Y \to T_x^* X$ is injective.

Remark 2.3. When $m = n$, a bi-twist hypothesis is required to prove regularity
of the optimal map; in addition to (A1), one must assume $x \mapsto D_y c(x, y)$
is injective on $X$ for all $y \in Y$. Clearly, such a condition cannot hold if
$m > n$; in fact, the non-degeneracy condition and the implicit function theorem
imply that the level sets $L_x(y)$ of this mapping are smooth $(m - n)$-dimensional
hypersurfaces. Later, we will assume that these level sets are
connected. When $m = n$, non-degeneracy implies that each $L_x(y)$ consists of
finitely many isolated points, in which case connectedness implies that it is
in fact a singleton, or, equivalently, that $x \mapsto D_y c(x, y)$ is injective.

The statements of (A3W) and (A3S), the most important regularity
conditions, require a little more machinery. For a twisted cost, the mapping
$y \mapsto D_x c(x, y)$ is invertible on its range. We define the c-exponential map at
$x$, denoted by $\text{c-exp}_x(\cdot)$, to be its inverse; that is, $D_x c(x, \text{c-exp}_x(p)) = p$ for
all $p \in D_x c(x, Y)$.

Definition 2.4. Let $x \in X$ and $y \in Y$. Choose tangent vectors $u \in T_x X$
and $v \in T_y Y$. Set $p = D_x c(x, y) \in T_x^* X$ and $q = (D^2_{xy} c(x, y)) \cdot v \in T_x^* X$;
note that if $Y$ looks c-linear at $x$, $p + tq \in D_x c(x, Y)$ for small $t$. For any
smooth curve $\beta(s)$ in $X$ with $\beta(0) = x$ and $\frac{d\beta}{ds}(0) = u$, we define the Ma,
Trudinger and Wang curvature at $x$ and $y$ in the directions $u$ and $v$ by:

$$MTW_{xy}(u, v) := -\frac{3}{2} \frac{\partial^4}{\partial s^2 \partial t^2} c\big(\beta(s), \text{c-exp}_x(p + tq)\big)\Big|_{s = t = 0}.$$

We are now ready to state the final conditions of Ma, Trudinger and
Wang. Because they are designed to deal with the general case $m \geq n$, our
formulations look somewhat different from those found in [25]; when $m = n$,
they reduce to the standard conditions.

(A3W): For all $x \in X$, $y \in Y$, $u \in T_x X$ and $v \in T_y Y$ such that
$u \cdot D^2_{xy} c(x, y) \cdot v = 0$, $MTW_{xy}(u, v) \geq 0$.

(A3S): For all $x \in X$, $y \in Y$, $u \in T_x X$ and $v \in T_y Y$ such that
$u \cdot (D^2_{xy} c(x, y)) \cdot v = 0$, $u \cdot (D^2_{xy} c(x, y)) \neq 0$ and $v \neq 0$, we have $MTW_{xy}(u, v) > 0$.

If $m = n$, non-degeneracy implies that the condition $u \cdot (D^2_{xy} c(x, y)) \neq 0$
is equivalent to $u \neq 0$.

3 Regularity of optimal maps 

The following theorem asserts the existence of an optimal map. It is due to
Levin [20] in the case where $X$ is a bounded domain in $\mathbb{R}^m$ and $\mu$ is absolutely
continuous with respect to Lebesgue measure. The following version can be
proved in the same way; see also Brenier [2], Gangbo [15], Gangbo and
McCann [16] and Caffarelli [6].

Theorem 3.1. Suppose $c$ is twisted and $\mu(A) = 0$ for all Borel sets $A \subseteq X$ of
Hausdorff dimension less than or equal to $m - 1$. Then the Monge problem
admits a unique solution $F$ of the form $F(x) = \text{c-exp}_x(Du(x))$ for some
c-concave function $u$.

The following example confirms the necessity of c-convexity to regularity. 
It is due to Ma, Trudinger and Wang in the case where m = n; their proof 
applies to the m > n case as well. 

Theorem 3.2. Suppose there exists some $x \in X$ such that $Y$ does not look
c-convex from $x$. Then there exist smooth measures $\mu$ and $\nu$ for which the
optimal map is discontinuous.

As c-convexity implies c-linearity, this example verifies that we cannot
hope to develop a regularity theory in the absence of c-linearity. The following
lemma demonstrates that, under the c-linearity hypothesis, the level sets
$L_x(y)$ are the same for each $y$, yielding a canonical foliation of the space $X$.

Lemma 3.3. (i) $Y$ looks c-linear from $x \in X$ if and only if $T_x(L_x(y))$ is
independent of $y$; that is, $T_x(L_x(y_0)) = T_x(L_x(y_1))$ for all $y_0, y_1 \in Y$.
(ii) If the level sets $L_x(y)$ are all connected, then $Y$ is c-linear with respect
to $X$ if and only if $L_x(y)$ is independent of $y$ for all $x$.

Proof. We first prove (i). The tangent space to $L_x(y)$ at $x$ is the null space
of the map $D^2_{yx} c(x, y) : T_x X \to T_y^* Y$, which, in turn, is the orthogonal
complement of the range of $D^2_{xy} c(x, y) : T_y Y \to T_x^* X$. Therefore, $T_x(L_x(y))$
is independent of $y$ if and only if the range of $D^2_{xy} c(x, y)$ is independent of
$y$. But $D^2_{xy} c(x, y)$ is the differential of the map $y \mapsto D_x c(x, y)$ (making the
obvious identification between $T_x^* X$ and its tangent space at a point) and so
its range is independent of $y$ if and only if the image of this map is linear.

To see (ii), note that (i) implies $Y$ is c-linear with respect to $X$ if and
only if $T_x(L_x(y_0)) = T_x(L_x(y_1))$ for all $x \in X$ and all $y_0, y_1 \in Y$. But
$T_x(L_x(y_0)) = T_x(L_x(y_1))$ for all $x$ is equivalent to $L_x(y_0) = L_x(y_1)$ for all $x$;
this immediately yields (ii). □

For the remainder of this section, we will assume that $L_x(y)$ is connected
and independent of $y$ for all $x$ and we will denote it simply by $L_x$. In this case,
we will demonstrate now that points in the same level set are indistinguishable
from an optimal transportation perspective. The $L_x$'s define a canonical
foliation of $X$ and our problem will be reduced to an optimal transportation
problem between $Y$ and the space of leaves of this foliation. More precisely,
we define an equivalence relation on $X$ by $x \sim \tilde{x}$ if $\tilde{x} \in L_x$. We then define
the quotient space $Z = X / \sim$ and the quotient map $Q : X \to Z$. Note that,
for any fixed $y_0 \in Y$, the map $x \mapsto D_y c(x, y_0) \in T_{y_0}^* Y$ has the same level
sets as $Q$ (namely the $L_x$'s) and is smooth by assumption. Furthermore, the
non-degeneracy condition implies that this map is open and hence a quotient
map. We can therefore identify $Z \approx D_y c(X, y_0)$ with a subset of the cotangent
space $T_{y_0}^* Y$. In particular, $Z$ has a smooth structure, and, if $c$ satisfies
(A0), $Q$ is $C^3$.

Our strategy now will be to show that if $F : X \to Y$ is the optimal map,
then $F$ factors through $Q$: $F = T \circ Q$. As $Q$ is smooth, this will imply that
treating the smoothness of $F$ reduces to studying the smoothness of $T$. To
this end, we will show that $T$ itself solves an optimal transportation problem
with marginals $\alpha = Q_\# \mu$ on $Z$ and $\nu$ on $Y$ relative to the cost function $b(z, y)$
defined uniquely by:

$$D_y b(z, y) = D_y c(x, y) \quad \text{for } x \in Q^{-1}(z), \qquad b(z, y_0) = 0.$$

As $Z$ and $Y$ share the same dimension, the regularity theory of Ma, Trudinger
and Wang will apply in this context.

We first obtain a useful formula for the cost function $b$.

Proposition 3.4. For any $z \in Z$, $y \in Y$ and $x \in Q^{-1}(z)$, we have $b(z, y) =
c(x, y) - c(x, y_0)$.

Proof. For $y = y_0$ the result follows immediately from the definition of $b$. As
$D_y b(z, y) = D_y c(x, y)$ for all $y$, the formula holds everywhere. □

Note that this implies $c(x, y) = b(Q(x), y) + c(x, y_0)$, which is equivalent
to $b(Q(x), y)$ for optimal transportation purposes: the term $c(x, y_0)$ depends
only on $x$, so it contributes the same constant $\int_X c(x, y_0)\, d\mu$ to the total cost
of every map pushing $\mu$ forward to $\nu$.
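
The equivalence of $c$ and $b(Q(x), y)$ can be seen in a small discrete problem. In the sketch below the $3 \times 3$ cost table and the shift vector are made-up numbers standing in for $b(Q(x_i), y_j)$ and $c(x_i, y_0)$; the optimal assignment is found by brute force over permutations, which is only sensible at this toy size.

```python
from itertools import permutations


def optimal_assignment(cost_matrix):
    """Brute-force minimizer of sum_i cost[i][sigma(i)] over permutations."""
    n = len(cost_matrix)
    return min(permutations(range(n)),
               key=lambda s: sum(cost_matrix[i][s[i]] for i in range(n)))


# Hypothetical values of b(Q(x_i), y_j) for three sources and three targets.
b = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]

# Hypothetical values of c(x_i, y_0): a term depending on the source only.
shift = [10, -3, 7]

# c(x_i, y_j) = b(Q(x_i), y_j) + c(x_i, y_0)
c = [[b[i][j] + shift[i] for j in range(3)] for i in range(3)]

print(optimal_assignment(b), optimal_assignment(c))
```

Every permutation's total cost changes by the same amount, `sum(shift)`, so the minimizer is unchanged.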

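Lemma 3.5. For any $x_0, x_1 \in L_x$, $y \in Y$ and c-concave $u$ we have $u(x_0) =
c(x_0, y) - u^c(y)$ if and only if $u(x_1) = c(x_1, y) - u^c(y)$.

Proof. First note that as $D_y c(x_0, \tilde{y}) - D_y c(x_1, \tilde{y}) = 0$ for all $\tilde{y} \in Y$, the
difference $c(x_0, \tilde{y}) - c(x_1, \tilde{y})$ is independent of $\tilde{y}$. Now, suppose $u(x_0) =
c(x_0, y) - u^c(y)$. Then

$$\begin{aligned}
u(x_1) &= \inf_{\tilde{y} \in Y} \big(c(x_1, \tilde{y}) - u^c(\tilde{y})\big) \\
&= \inf_{\tilde{y} \in Y} \big(c(x_1, \tilde{y}) - c(x_0, \tilde{y}) + c(x_0, \tilde{y}) - u^c(\tilde{y})\big) \\
&= c(x_1, y) - c(x_0, y) + \inf_{\tilde{y} \in Y} \big(c(x_0, \tilde{y}) - u^c(\tilde{y})\big) \\
&= c(x_1, y) - c(x_0, y) + u(x_0) \\
&= c(x_1, y) - u^c(y).
\end{aligned}$$

The proof of the converse is identical. □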

Proposition 3.6. Suppose $c$ is twisted and $\mu$ doesn't charge sets of Hausdorff
dimension $m - 1$. Let $F : X \to Y$ be the optimal map. Then there exists a
map $T : Z \to Y$ such that $F = T \circ Q$, $\mu$ almost everywhere. Moreover, $T$
solves the optimal transportation problem on $Z \times Y$ with cost function $b$ and
marginals $\alpha$ and $\nu$.

Proof. It is well known that there exists a c-concave function $u(x)$ such that,
for $\mu$ almost every $x$, there is a unique $y \in Y$ such that $u(x) = c(x, y) - u^c(y)$;
in this case, $F(x) = y$.

For $\alpha$ almost every $z \in Z$, Lemma 3.5 now implies that there is a unique
$y \in Y$ such that $u(x) = c(x, y) - u^c(y)$ for all $x \in Q^{-1}(z)$; define $T(z)$ to be
this $y$. It then follows immediately that $F = T \circ Q$, $\mu$ almost everywhere,
and that $T$ pushes $\alpha$ to $\nu$.

Now, suppose $G : Z \to Y$ is another map pushing $\alpha$ to $\nu$. Then $G \circ Q$
pushes $\mu$ to $\nu$ and because of the optimality of $F = T \circ Q$ we have

$$\int_X c(x, T \circ Q(x))\, d\mu \leq \int_X c(x, G \circ Q(x))\, d\mu. \tag{3}$$

Now, using Proposition 3.4 we have

$$\begin{aligned}
\int_X c(x, T \circ Q(x))\, d\mu &= \int_X \big(b(Q(x), T \circ Q(x)) + c(x, y_0)\big)\, d\mu \\
&= \int_Z b(z, T(z))\, d\alpha + \int_X c(x, y_0)\, d\mu.
\end{aligned}$$

Similarly,

$$\int_X c(x, G \circ Q(x))\, d\mu = \int_Z b(z, G(z))\, d\alpha + \int_X c(x, y_0)\, d\mu$$

and so (3) becomes

$$\int_Z b(z, T(z))\, d\alpha \leq \int_Z b(z, G(z))\, d\alpha.$$

Hence, $T$ is optimal. □

Having established that the optimal map $F$ from $X$ to $Y$ factors through
$Z$ via the quotient $Q$ and the optimal map $T$ from $Z$ to $Y$, we will now study
how the regularity conditions (A1)-(A3S) for $c$ translate to $b$.

Proposition 3.4 also allows us to understand the derivatives of $b$ with
respect to $z$. Pick a point $z_0 \in Z$ and select $x_0 \in Q^{-1}(z_0)$. Now, let $S$ be an
$n$-dimensional surface passing through $x_0$ which intersects $L_{x_0}$ transversely.
As the null space of the map $D^2_{yx} c(x, y_0) : T_x X \to T_{y_0}^* Y$ is precisely $T_x L_x$ for
any $y_0$, it is invertible when restricted to $T_x S$; by the inverse function theorem,
the map $D_y c(\cdot, y_0)$ restricts to a local diffeomorphism on $S$. For all $z$ near $z_0$,
there is a unique $x \in S \cap Q^{-1}(z)$ and we have $b(z, y) = c(x, y) - c(x, y_0)$; we
can now identify $D_z b(z, y) \approx D_x c|_{S \times Y}(x, y) - D_x c|_{S \times Y}(x, y_0)$ and $D^2_{zy} b(z, y) \approx
D^2_{xy} c|_{S \times Y}(x, y)$. We use this observation to prove the following result.

Theorem 3.7. (i) If $c$ is twisted, $b$ is bi-twisted.

(ii) If $c$ is non-degenerate, $b$ is non-degenerate.

(iii) If $Y$ is c-convex, it is also b-convex.

Proof. The injectivity of $z \mapsto D_y b(z, y)$ follows immediately from the
definition of $b$. Injectivity of $y \mapsto D_z b(z, y)$ and non-degeneracy follow from
the preceding identification.

Note that transversality implies $T_x^* X = T_x^* L_x \oplus T_x^* S$. Our local identification
between $Z$ and $S$ identifies the projection of the range $D_x c(x, Y)$
onto $T_x^* S$ with $D_z b(z, Y)$. As the projection of a convex set is convex, the
b-convexity of $Y$ now follows from its c-convexity. □

Theorem 3.8. The following are equivalent:

1. $b$ satisfies (A3W).

2. $c$ satisfies (A3W).

3. $c$ satisfies (A3W) when restricted to any smooth surface $S \subseteq X$ of
dimension $n$ which is transverse to each $L_x$ that it intersects.

Proof. The equivalence of (1) and (3) follows immediately from our identification.
Clearly, (2) implies (3); to see that (3) implies (2) it suffices to
show $MTW_{xy}(u, v) = 0$ when $u \in T_x L_x$, as $MTW_{xy}$ is linear in $u$. Choosing
a curve $\beta(s) \in L_x$ such that $\beta(0) = x$ and $\frac{d\beta}{ds}(0) = u$, and $p$, $q$ as in the
definition, we have

$$\frac{d\beta}{ds}(s) \in T_{\beta(s)} L_{\beta(s)} = \operatorname{null}\big(D^2_{yx} c(\beta(s), \text{c-exp}_x(p + tq))\big)$$

for all $s$ and $t$, yielding

$$\frac{\partial^2}{\partial s\, \partial t} c\big(\beta(s), \text{c-exp}_x(p + tq)\big) = 0.$$

Hence, $MTW_{xy}(u, v) = 0$. □
Theorem 3.9. The following are equivalent:

1. $b$ satisfies (A3S).

2. $c$ satisfies (A3S).

3. $c$ satisfies (A3S) when restricted to any smooth surface $S \subseteq X$ of
dimension $n$ which is transverse to each $L_x$ that it intersects.

Proof. The equivalence follows immediately from the identification, after
observing that the $u \cdot (D^2_{xy} c(x, y)) \neq 0$ condition in the definition of (A3S)
excludes the non-transverse directions. □

Various regularity results for $T$ (and therefore $F$) now follow from the
regularity results of Ma, Trudinger and Wang [25], Loeper [22] and Liu [21].
Note, however, that these results all require certain regularity hypotheses on
the marginals; to apply them in the present context, we must check these
conditions on $\alpha$, rather than $\mu$. A brief discussion on whether the relevant
regularity conditions on $\mu$ translate to $\alpha$ therefore seems in order.

First, suppose $X$ is a bounded domain in $\mathbb{R}^m$ and $\mu = f(x)\,dx$ is absolutely
continuous with respect to $m$-dimensional Lebesgue measure. Then $\alpha$ is
absolutely continuous with respect to $n$-dimensional Lebesgue measure with
density $h(z)$ given by the coarea formula:

$$h(z) = \int_{Q^{-1}(z)} \frac{f(x)}{JQ(x)}\, d\mathcal{H}^{m-n}(x),$$

where $JQ$ is the Jacobian of the map $Q$, restricted to the orthogonal
complement of $T_x L_x$.

Lemma 3.10. Suppose $f \in L^p(X)$ (with respect to Lebesgue measure on $X$)
for some $p \in [1, \infty]$. Then $h \in L^p(Z)$.



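
Proof. We have $h^p(z) = \left(\int_{Q^{-1}(z)} \frac{f(x)}{JQ(x)}\, d\mathcal{H}^{m-n}(x)\right)^p$. Normalizing and applying
Jensen's inequality yields:

$$\begin{aligned}
\frac{h^p(z)}{C^p(z)} &\leq \int_{Q^{-1}(z)} \frac{f^p(x)}{(JQ(x))^p\, C(z)}\, d\mathcal{H}^{m-n}(x) \\
&\leq \int_{Q^{-1}(z)} \frac{f^p(x)}{JQ(x)\, C(z)\, K^{p-1}}\, d\mathcal{H}^{m-n}(x),
\end{aligned}$$

where $C(z)$ is the $(m - n)$-dimensional Hausdorff measure of $Q^{-1}(z)$ and
$K > 0$ is a global lower bound on $JQ(x)$. Letting $C$ be a global upper bound
on $C(z)$ and integrating over $z$ implies:

$$\begin{aligned}
\int_Z h^p(z)\, dz &\leq \frac{C^{p-1}}{K^{p-1}} \int_Z \int_{Q^{-1}(z)} \frac{f^p(x)}{JQ(x)}\, d\mathcal{H}^{m-n}(x)\, dz \\
&= \frac{C^{p-1}}{K^{p-1}} \int_X f^p(x)\, dx < \infty,
\end{aligned}$$

where we have again used the coarea formula in the last step. □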

Let us note, however, that an analogous result does not hold for the
weaker condition introduced by Loeper [22], which requires that for all $x \in X$
and $\epsilon > 0$

$$\mu(B_\epsilon(x)) \leq K \epsilon^{n(1 - \frac{1}{p})}$$

for some $p > n$ and $K > 0$. Indeed, if $m - n \geq n$, we can take $\mu$ to be
$(m - n)$-dimensional Hausdorff measure on a single level set $L_x$. Then $\mu$ will
satisfy the above condition for any $p$, but $\alpha$ will consist of a single Dirac
mass.

The preceding lemma allows us to immediately translate the regularity
results of Loeper and Liu to the present setting.

Corollary 3.11. Suppose that $Y$ is c-convex with connected level sets $L_x(y)$
for all $x \in X$ and $y \in Y$, and that (A0), (A1), (A2) and (A3S) hold.
Suppose that $f \in L^p(X)$ for some $p > \frac{n+1}{2}$. Then the optimal map is Hölder
continuous with Hölder exponent $\frac{\beta(n+1)}{2n^2 + \beta(n-1)}$, where $\beta = 1 - \frac{n+1}{2p}$.






The higher regularity results of Ma, Trudinger and Wang require $C^2$
smoothness of the density $h$. As the following example demonstrates, however,
smoothness of $f$ does not even imply continuity of $h$.

Example 3.12. Let

$$X = \{x = (x_1, x_2) : -1 < x_1 < 1,\ -1 < x_2 < \phi(x_1)\} \subseteq \mathbb{R}^2$$

where $\phi : (-1, 1) \to (-1, 1)$ is a $C^\infty$ function such that $\phi(x_1) = 0$ for all
$-1 < x_1 < 0$, $\phi(1) = 1$ and $\phi$ is strictly increasing on $(0, 1)$. Let $Y =
(0, 1) \subseteq \mathbb{R}$ and $c(x, y) = x_2 y$. Then $Y$ is c-convex and $c$ satisfies (A0)-(A3S).
The level sets $L_x$ are simply the curves $\{x : x_2 = z\}$ for constant
values of $z \in (-1, 1)$ and $Z = (-1, 1)$. Set $f(x) = k$, where $k$ is a constant
chosen so that $\mu$ has total mass 1. The density $h$ is then easy to compute;
it is simply $k$ times the length of the line segment $Q^{-1}(z)$. For $z < 0$, $h(z) = 2k$;
however, for $z > 0$, $h(z) = k(1 - \phi^{-1}(z)) < k$.¹
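
The jump in the density of Example 3.12 can be checked numerically. The sketch below assumes one particular profile, $\phi(t) = e^{1 - 1/t}$ for $t > 0$ and $\phi(t) = 0$ otherwise, which is a hypothetical choice satisfying the stated requirements; the example itself does not fix $\phi$. It estimates the length of the fibre $Q^{-1}(z) = \{x_1 : \phi(x_1) > z\}$ by midpoint sampling, so that $h(z)$ is $k$ times the returned value.

```python
import math


def phi(t):
    """Hypothetical smooth profile: 0 for t <= 0, exp(1 - 1/t) for t > 0."""
    return 0.0 if t <= 0.0 else math.exp(1.0 - 1.0 / t)


def fibre_length(z, samples=20000):
    """Length of {x1 in (-1, 1) : phi(x1) > z}, by midpoint sampling."""
    step = 2.0 / samples
    count = sum(1 for i in range(samples)
                if phi(-1.0 + (i + 0.5) * step) > z)
    return count * step


def closed_form_length(z):
    """For 0 < z < 1 and this phi: 1 - phi^{-1}(z), with phi^{-1}(z) = 1 / (1 - ln z)."""
    return 1.0 - 1.0 / (1.0 - math.log(z))


print(fibre_length(-1e-6))                          # just below z = 0
print(fibre_length(1e-6), closed_form_length(1e-6)) # just above z = 0
```

Just below $z = 0$ the fibre is the whole interval of length 2, while just above it the length is already less than 1, so $h$ cannot be continuous at 0 however smooth $\phi$ is.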

On the other hand, we should note that it is possible for $\alpha$ to be smooth even
when $\mu$ is singular. This will be the case if, for example, $\mu$ is $n$-dimensional
Hausdorff measure concentrated on some smooth $n$-dimensional surface $S$
which intersects the $L_x$'s transversely.

Finally, we exploit Loeper's counterexample, which shows that, when
$m = n$ and (A3W) fails, there are smooth densities for which the optimal
map is not continuous.

Corollary 3.13. Suppose that $Y$ is c-convex and that the level sets $L_x(y)$
are connected for all $x \in X$ and $y \in Y$. Assume (A0), (A1), and (A2)
hold but (A3W) fails. Then there are smooth marginals $\mu$ on $X$ and $\nu$ on
$Y$ such that the optimal map is discontinuous.

Proof. Using Proposition 3.4, it is easy to check that $u : X \to \mathbb{R}$ is c-concave
if and only if $u(x) = v(Q(x)) + c(x, y_0)$ for some b-concave $v : Z \to \mathbb{R}$. By
[22], we know that if (A3W) fails, then the set of $C^1$, b-concave functions is
not dense in the set of all b-concave functions in the $L^\infty(Z)$ topology. From
this it follows easily that the set of $C^1$, c-concave functions is not dense in
the set of all c-concave functions in the $L^\infty(X)$ topology. The argument in
[22] now implies the desired result. □

¹ It should be noted that, while the boundary of $X$ is not smooth here, this is not
the reason for the discontinuity in $h$; the corners of the boundary can be mollified and the
density will still be discontinuous at 0.

4 Regularity for non-c-convex targets

The counterexamples of Ma, Trudinger and Wang, combined with the results
in the previous section, imply that we cannot hope that the optimizer is
continuous for arbitrary smooth data if the level sets $L_x(y)$ are not independent
of $y$. It is then natural to ask: for which marginals can we expect the optimal
map to be smooth? In this section, we study this question in the special case
when $m = 2$ and $n = 1$. We identify conditions on the interaction between
the marginals and the cost that allow us to find an explicit formula for the
optimal map and prove that it is continuous.

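We will assume $Y = (a, b) \subseteq \mathbb{R}$ is an open interval and that $X$ is a
bounded domain in $\mathbb{R}^2$. We will also assume that $c \in C^2(\bar{X} \times \bar{Y})$ satisfies
(A2), which in this setting simply means that the gradient $\nabla_x\left(\frac{\partial c}{\partial y}\right)$ never
vanishes. Therefore, the level sets $L_x(y)$ will all be $C^1$ curves. We define the
following set:

$$P = \left\{x \in \bar{X} : \forall\, y_0 < y_1 \in Y,\ \tilde{x} \in L_x(y_0), \text{ we have } \frac{\partial c(x, y_1)}{\partial y} \leq \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}$$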

When the level sets $L_x(y)$ are independent of $y$, $P$ is the entire domain
$X$. If not, $P$ consists of points $x$ for which the level sets $L_x(y)$ evolve with
$y$ in a monotonic way. $L_x(y_1)$ divides the region $X$ into two subregions:
$\left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} > \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}$ and $\left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}$. The condition $x \in P$ ensures that for
$y_0 < y_1$, the set $L_x(y_0)$ will lie entirely in the closure of the latter region. For interior points,
the curves $L_x(y_0)$ and $L_x(y_1)$ will generically intersect transversely and so
$L_x(y_0)$ will intersect both of these regions; therefore, $P$ will typically consist
only of boundary points. At each boundary point $x$, we can heuristically
view the level curves $L_x(y)$ as rotating about the point $x$; $P$ consists of those
points which rotate in a particular fixed direction.

In what follows, $\gamma$ will be a solution to the Kantorovich problem. The
support of $\gamma$, or $\text{spt}(\gamma)$, is the smallest closed subset of $X \times Y$ of full mass.

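Lemma 4.1. Suppose $x \in P$, $\tilde{x} \in X$, $y_0, y_1 \in Y$ and $(x, y_1), (\tilde{x}, y_0) \in \text{spt}(\gamma)$.
Then $\frac{\partial c(x, y_1)}{\partial y} \leq \frac{\partial c(\tilde{x}, y_1)}{\partial y}$ if $y_0 < y_1$ and $\frac{\partial c(x, y_1)}{\partial y} \geq \frac{\partial c(\tilde{x}, y_1)}{\partial y}$ if $y_0 > y_1$.

Proof. The support of $\gamma$ is c-monotone (see [27] for a proof); this means that
$c(x, y_1) + c(\tilde{x}, y_0) \leq c(x, y_0) + c(\tilde{x}, y_1)$. If $y_0 < y_1$, this implies

$$\int_{y_0}^{y_1} \frac{\partial c(x, y)}{\partial y}\, dy \leq \int_{y_0}^{y_1} \frac{\partial c(\tilde{x}, y)}{\partial y}\, dy. \tag{4}$$

Assume $\frac{\partial c(x, y_1)}{\partial y} > \frac{\partial c(\tilde{x}, y_1)}{\partial y}$. We claim that this implies $\frac{\partial c(x, y)}{\partial y} > \frac{\partial c(\tilde{x}, y)}{\partial y}$ for all
$y \in [y_0, y_1]$, which contradicts (4). To see this, suppose that there is some
$y \in [y_0, y_1]$ such that $\frac{\partial c(x, y)}{\partial y} \leq \frac{\partial c(\tilde{x}, y)}{\partial y}$; the Intermediate Value Theorem then
implies the existence of a $\bar{y} \in [y, y_1]$ such that $\frac{\partial c(x, \bar{y})}{\partial y} = \frac{\partial c(\tilde{x}, \bar{y})}{\partial y}$, or $\tilde{x} \in L_x(\bar{y})$.
This, together with our assumption $\frac{\partial c(x, y_1)}{\partial y} > \frac{\partial c(\tilde{x}, y_1)}{\partial y}$, violates the condition
$x \in P$.

A similar argument shows $\frac{\partial c(x, y_1)}{\partial y} \geq \frac{\partial c(\tilde{x}, y_1)}{\partial y}$ if $y_0 > y_1$. □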
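
The c-monotonicity used in this proof has a simple discrete counterpart: in an optimal assignment between finitely many points, no swap of two targets can lower the cost. The sketch below illustrates this for the cost $c(x, y) = -x_1 \cos(y) - x_2 \sin(y)$ of Example 4.10; the sample points and angles are arbitrary values chosen here, and the brute-force search over permutations is only meant for such small inputs.

```python
import math
from itertools import permutations


def cost(x, y):
    """c(x, y) = -x1 cos(y) - x2 sin(y)."""
    return -x[0] * math.cos(y) - x[1] * math.sin(y)


def optimal_pairs(xs, ys):
    """Brute-force optimal assignment, returned as a list of (x, y) pairs."""
    n = len(xs)
    best = min(permutations(range(n)),
               key=lambda s: sum(cost(xs[i], ys[s[i]]) for i in range(n)))
    return [(xs[i], ys[best[i]]) for i in range(n)]


def is_c_monotone(pairs, tol=1e-12):
    """Check c(x0, y0) + c(x1, y1) <= c(x0, y1) + c(x1, y0) for all pairs."""
    return all(cost(x0, y0) + cost(x1, y1) <= cost(x0, y1) + cost(x1, y0) + tol
               for x0, y0 in pairs for x1, y1 in pairs)


# Hypothetical sample data in the quarter disk and in (0, pi/2).
xs = [(0.9, 0.1), (0.7, 0.4), (0.5, 0.5), (0.3, 0.8), (0.1, 0.6)]
ys = [0.2, 0.5, 0.8, 1.1, 1.4]

print(is_c_monotone(optimal_pairs(xs, ys)))
# A deliberately bad pairing: each point is sent to the far end of the arc.
print(is_c_monotone([((1.0, 0.0), 1.4), ((0.0, 1.0), 0.1)]))
```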
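
Definition 4.2. We say $y$ splits the mass at $x$ if

$$\mu\left(\left\{\tilde{x} : \frac{\partial c(x, y)}{\partial y} < \frac{\partial c(\tilde{x}, y)}{\partial y}\right\}\right) = \nu([0, y)).$$

If $\mu$ and $\nu$ are absolutely continuous with respect to Lebesgue measure, this
is equivalent to

$$\mu\left(\left\{\tilde{x} : \frac{\partial c(x, y)}{\partial y} > \frac{\partial c(\tilde{x}, y)}{\partial y}\right\}\right) = 1 - \nu([0, y)).$$

Lemma 4.1 immediately implies the following.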

Lemma 4.3. Suppose $\mu$ and $\nu$ are absolutely continuous with respect to
Lebesgue measure. Then if $x \in P$, $y \in Y$ and $(x, y) \in \text{spt}(\gamma)$, $y$ splits the
mass at $x$.

Lemma 4.4. Suppose $\mu$ and $\nu$ are absolutely continuous with respect to
Lebesgue measure. Then, for each $x \in X$ there is a $y \in Y$ that splits the mass at
$x$.

Proof. The function $y \mapsto f_x(y) := \mu\left(\left\{\tilde{x} : \frac{\partial c(x, y)}{\partial y} < \frac{\partial c(\tilde{x}, y)}{\partial y}\right\}\right) - \nu([0, y))$ is
continuous. Observe that $f_x(0) \geq 0$ and $f_x(1) \leq 0$; the result now follows
from the Intermediate Value Theorem. □

Similarly, it is straightforward to prove the following lemma.

Lemma 4.5. Suppose $\mu$ and $\nu$ are absolutely continuous with respect to
Lebesgue measure. Then, for each $y \in Y$ there is an $x \in X$ such that $y$ splits the
mass at $\tilde{x}$ if and only if $\tilde{x} \in L_x(y)$.

Definition 4.6. Let $x \in P$. We say $x$ satisfies the mass comparison property
(MCP) if for all $y_0 < y_1 \in Y$ we have

$$\mu\Big(\bigcup_{y \in [y_0, y_1]} L_x(y)\Big) < \nu([y_0, y_1]).$$

In the case when the level sets $L_x(y)$ are independent of $y$, the MCP is
satisfied for all $x \in P = X$ as long as $\mu$ assigns zero mass to every $L_x(y)$ and
$\nu$ assigns non-zero mass to every open interval. Alternatively, in view of the
previous section, we know that in this case the cost has the form $c(Q(x), y)$,
where $Q : X \to Z$ and $Z = [z_0, z_1] \subseteq \mathbb{R}$ is an interval; the MCP boils down
to the assumption that $\alpha$ assigns zero mass to all singletons and $\nu$ assigns
non-zero mass to every open interval.

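Lemma 4.7. Suppose $\mu$ and $\nu$ are absolutely continuous with respect to
Lebesgue measure and that $x \in P$ satisfies the MCP. Then there is a unique
$y \in Y$ that splits the mass at $x$.

Proof. Existence follows from Lemma 4.4; we must only show uniqueness.
Suppose $y_0 < y_1 \in Y$ both split the mass at $x$. For any $\tilde{x}$ such that
$\frac{\partial c(x, y_0)}{\partial y} > \frac{\partial c(\tilde{x}, y_0)}{\partial y}$ and $\frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}$, the Intermediate Value Theorem yields a
$y \in [y_0, y_1]$ such that $\tilde{x} \in L_x(y)$; hence,

$$\left\{\tilde{x} : \frac{\partial c(x, y_0)}{\partial y} > \frac{\partial c(\tilde{x}, y_0)}{\partial y}\right\} \cap \left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\} \subseteq \bigcup_{y \in [y_0, y_1]} L_x(y).$$

Therefore

$$\mu\left(\left\{\tilde{x} : \frac{\partial c(x, y_0)}{\partial y} > \frac{\partial c(\tilde{x}, y_0)}{\partial y}\right\} \cap \left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}\right) \leq \mu\Big(\bigcup_{y \in [y_0, y_1]} L_x(y)\Big). \tag{5}$$

Now, absolute continuity of $\mu$ and $\nu$, together with the assumption that $y_0$
and $y_1$ split the mass at $x$, yield

$$\begin{aligned}
&\mu\left(\left\{\tilde{x} : \frac{\partial c(x, y_0)}{\partial y} > \frac{\partial c(\tilde{x}, y_0)}{\partial y}\right\} \cap \left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}\right) \\
&\qquad \geq \mu\left(\left\{\tilde{x} : \frac{\partial c(x, y_1)}{\partial y} < \frac{\partial c(\tilde{x}, y_1)}{\partial y}\right\}\right) - \mu\left(\left\{\tilde{x} : \frac{\partial c(x, y_0)}{\partial y} < \frac{\partial c(\tilde{x}, y_0)}{\partial y}\right\}\right) \\
&\qquad = \nu([y_0, y_1]).
\end{aligned} \tag{6}$$

Combining (5) and (6) and the MCP now yields a contradiction. □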

We are now ready to prove the main result of this section.

Theorem 4.8. Suppose $\mu$ and $\nu$ are absolutely continuous with respect to
Lebesgue measure. Suppose that for all $(x, y) \in X \times Y$ such that $y$ splits the mass at
$x$ there exists an $\tilde{x} \in P \cap L_x(y)$ satisfying the MCP. Then for each $x \in X$
there is a unique $y \in Y$ that splits the mass at $x$. Moreover, $(x, y) \in \text{spt}(\gamma)$
and $(x, y') \notin \text{spt}(\gamma)$ for all other $y' \in Y$. Therefore, the optimal map is well
defined everywhere.

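Proof. For each $x \in X$, by Lemma 4.4 we can choose $y \in Y$ that splits
the mass at $x$; the hypothesis then implies the existence of $\tilde{x} \in P \cap L_x(y)$
satisfying the MCP. Lemmas 4.7 and 4.3 imply that $(\tilde{x}, y) \in \text{spt}(\gamma)$.
We now show that

$$(x, y') \notin \text{spt}(\gamma) \text{ for all } y' \neq y. \tag{7}$$

The proof is by contradiction; to this end, assume $(x, y') \in \text{spt}(\gamma)$ for some
$y' \neq y$. Suppose $y' > y$; choose $\bar{y} \in (y, y')$. By Lemma 4.5 we can choose
$\bar{x}$ such that $\bar{y}$ splits the mass at $\bar{x}$. Now use the hypothesis of the theorem
again to find $\hat{x} \in P \cap L_{\bar{x}}(\bar{y})$ satisfying the MCP and note that $(\hat{x}, \bar{y}) \in \text{spt}(\gamma)$.
By Lemma 4.7, $\tilde{x} \notin L_{\hat{x}}(\bar{y})$, and so Lemma 4.1 implies $\frac{\partial c(\hat{x}, \bar{y})}{\partial y} < \frac{\partial c(\tilde{x}, \bar{y})}{\partial y}$.
Therefore,

$$\frac{\partial c(\hat{x}, \bar{y})}{\partial y} < \frac{\partial c(\tilde{x}, \bar{y})}{\partial y} \leq \frac{\partial c(x, \bar{y})}{\partial y},$$

where the second inequality holds because $\tilde{x} \in P$ and $x \in L_{\tilde{x}}(y)$ with $y < \bar{y}$.
But now $(x, y'), (\hat{x}, \bar{y}) \in \text{spt}(\gamma)$ and $y' > \bar{y}$ contradicts Lemma 4.1. An
analogous argument implies that we cannot have $(x, y') \in \text{spt}(\gamma)$ for $y' < y$,
completing the proof of (7).

Now, note that we must have $(x, \tilde{y}) \in \text{spt}(\gamma)$ for some $\tilde{y} \in \bar{Y}$ and so the
preceding argument implies $(x, y) \in \text{spt}(\gamma)$.

Finally, we must show that there is no other $y' \in Y$ which splits the mass
at $x$; this follows immediately, as if there were such a $y'$, an argument
analogous to the preceding one would imply that $(x, y') \in \text{spt}(\gamma)$, contradicting
(7). □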

Note that we can use Theorem 4.8 to derive a formula for the optimal
map:

$$F(x) := \sup\left\{y : \mu\left(\left\{\tilde{x} : \frac{\partial c(x, y)}{\partial y} < \frac{\partial c(\tilde{x}, y)}{\partial y}\right\}\right) > \nu([0, y))\right\}$$

Corollary 4.9. Under the assumptions of the preceding theorem, the optimal
map is continuous on $X$.

Proof. Choose $x_k \to x \in X$ and set $y_k = F(x_k)$; we need to show $y_k \to F(x)$.
Set $\bar{y} = \limsup_{k \to \infty} y_k \in \bar{Y}$; by passing to a subsequence we can assume
$y_k \to \bar{y}$. As $\text{spt}(\gamma)$ is closed by definition, we must have $(x, \bar{y}) \in \text{spt}(\gamma)$ and so
Theorem 4.8 implies $\bar{y} = F(x)$. A similar argument implies $\liminf_{k \to \infty} y_k =
F(x)$, completing the proof. □

The following example illustrates the implications of the preceding Corollary.

Example 4.10. Let $X$ be the quarter disk:

$$X = \{(x_1, x_2) : x_1 > 0,\ x_2 > 0,\ x_1^2 + x_2^2 < 1\}$$

Let $Y = (0, \frac{\pi}{2})$ and take $\mu$ and $\nu$ to be uniform measures on $X$ and $Y$,
respectively, scaled so that both have total mass 1. Let $c(x, y) = -x_1 \cos(y) -
x_2 \sin(y)$; this is equivalent to the squared Euclidean distance between $x$ and the point
on the unit circle parametrized by the polar angle $y$. We claim that the
optimal map takes the form $F(x) = \arctan(\frac{x_2}{x_1})$; that is, each point $x$ is
mapped to the point $\frac{x}{|x|}$ on the unit circle. Indeed, note that

$$c(x, y) \geq -\sqrt{x_1^2 + x_2^2} \tag{8}$$

with equality if and only if $y = F(x)$, and that uniform measure on the graph
$(x, F(x))$ projects to $\mu$ and $\nu$, implying the desired result. Now observe that $F$
is discontinuous at $(0, 0)$; in fact, $((0, 0), y)$ attains equality in (8) for all $y \in Y$, so the
optimal measure pairs the origin with every point. Note that the conditions
of Theorem 4.8 fail in this case, as every $y \in Y$ splits the mass at $(0, 0) \in \bar{X}$.



Now suppose instead that v is uniform measure on [0, j], rescaled to have 
total mass 1. It is not hard to check that $(0, x_2)$ is in $P$ and satisfies the
MCP for all $x_2$. Now, for all $(x, y) \in X \times Y$ such that $y$ splits the mass at $x$, it is
straightforward to verify that we have some $(0, x_2) \in L_x(y)$; hence, Corollary
4.9 implies continuity of the optimizer.
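
The mass-splitting description of the optimal map can be tested numerically in the first setting of Example 4.10, with $\mu$ uniform on the quarter disk and $\nu$ uniform on $(0, \frac{\pi}{2})$. The following is a rough sketch under stated assumptions: $\mu$ is replaced by an empirical measure on grid midpoints, the grid size and iteration count are arbitrary, and the function names are introduced here. For a point $x$ away from the origin it locates by bisection the angle $y$ at which $\mu(\{\tilde{x} : \partial_y c(x, y) < \partial_y c(\tilde{x}, y)\}) = \nu([0, y))$ and compares it with $\arctan(x_2 / x_1)$.

```python
import math


def quarter_disk_samples(n=200):
    """Midpoints of an n-by-n grid that fall inside the open quarter disk."""
    h = 1.0 / n
    return [((i + 0.5) * h, (j + 0.5) * h)
            for i in range(n) for j in range(n)
            if ((i + 0.5) * h) ** 2 + ((j + 0.5) * h) ** 2 < 1.0]


def split_defect(x, y, samples):
    """f_x(y) = mu({x~ : dc/dy(x, y) < dc/dy(x~, y)}) - nu([0, y)),
    with dc/dy(x, y) = x1 sin(y) - x2 cos(y) and mu approximated by samples."""
    s, co = math.sin(y), math.cos(y)
    ref = x[0] * s - x[1] * co
    mass = sum(1 for p in samples if ref < p[0] * s - p[1] * co) / len(samples)
    return mass - y / (math.pi / 2)  # nu is uniform on (0, pi/2)


def splitting_angle(x, samples, iters=20):
    """Bisection for the sign change of f_x on (0, pi/2)."""
    lo, hi = 0.0, math.pi / 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if split_defect(x, mid, samples) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


samples = quarter_disk_samples()
x = (0.6, 0.3)
print(splitting_angle(x, samples), math.atan2(x[1], x[0]))
```

A short computation suggests that the slope of $f_x$ at its zero is $-\frac{4}{\pi}|x|$, so the root becomes ill-conditioned as $x$ approaches the origin; this is consistent with every $y$ splitting the mass at $(0, 0)$.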